# Scaling Co-Packaged Optical Interconnects Using Hybrid 2.5D/3D Integration

Austin Rovinski,<sup>1,3</sup> Yanghui Ou,<sup>1</sup> Christine Ou,<sup>1</sup> Devesh Khilwani,<sup>1</sup> Yuyang Wang,<sup>2</sup> Songli Wang,<sup>2</sup> Sunwoo Lee,<sup>4</sup> Keren Bergman,<sup>2</sup> Alyosha Molnar,<sup>1</sup> Christopher Batten<sup>1</sup>

<sup>1</sup>School of Electrical and Computer Engineering, Cornell University, Ithaca, NY, yo96@cornell.edu
<sup>2</sup>Department of Electrical Engineering, Columbia University, New York, NY, yw3831@columbia.edu
<sup>3</sup>Department of Electrical and Computer Engineering, New York University, Brooklyn, NY, rovinski@nyu.edu
<sup>4</sup>Department of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, sunwoo.lee@ntu.edu.sg

Abstract—Tightly integrated optical interconnects can provide high-bandwidth, energy-efficient inter-node communication. We describe a novel system which uses hybrid 2.5D/3D integration to compose a state-of-the-art FPGA compute chiplet, three electrical interface chiplets, and three photonic interface chiplets. We use register-transfer-, gate-, transistor-, and device-level simulations to demonstrate the potential for this system to achieve 96 Tb/s of bi-directional bandwidth, and we experimentally demonstrate key components including a complete opto-electrical channel. Our results provide a strong case for hybrid 2.5D/3D integration as the key enabler for scaling co-packaged optical interconnects.

## I. INTRODUCTION

Modern data-center and high-performance computing workloads are increasingly limited by inter-node communication overheads. This has motivated the use of inter-node optical interconnects to enable longer reach, higher bandwidth, and lower energy compared to equivalent electrical interconnects [5]. State-of-the-art systems use fiber-optic cables that are connected to compute boards through pluggable optical transceiver modules, which are then connected to the compute package through board-level electrical interconnects. Unfortunately, this final step of the electrical interconnect is a significant bandwidth and energy bottleneck. Tightly integrated optical interconnects promise to overcome this bottleneck by directly attaching fiber-optic cables to the compute package itself.

Early work proposed monolithic integration, where optical devices are directly integrated into the compute die [1, 2, 17, 21, 22] (see Fig. 1(a)). However, this approach has yet to see widespread adoption as the optimal process for electronics is often sub-optimal for optics and vice-versa. An alternative approach uses 2.5D integration where an electrical compute chiplet and opto-electrical chiplet (with electrical transceivers and optical devices) are co-packaged on an interposer or with an embedded silicon bridge [9, 10] (see Fig. 1(b)). This approach allows the electrical compute chiplet to be fabricated in an advanced technology node, but designers again face a difficult trade-off in optimizing the opto-electrical chiplet. 3D integration involves an electrical compute chiplet (with compute logic and electrical transceivers) and an optical device chiplet directly integrated into a 3D stack [3, 4, 6–8, 13, 18] (see Fig. 1(c)). This approach not only requires more sophisticated packaging, but also requires fixing the integration of compute logic and transceivers at design time. Hybrid 2.5D/3D integration offers a compelling compromise: 3D integration stacks an optimized optical device chiplet with an optimized electrical transceiver chiplet and then 2.5D integration packages this 3D stack with an optimized electrical compute chiplet (see Fig. 1(d)). While prior work has proposed hybrid 2.5D/3D integration for co-packaged optics [3, 4], no prior work has experimentally demonstrated a complete system combining a state-of-the-art electrical compute chiplet, electrical transceiver chiplet, and optical device chiplet using hybrid 2.5D/3D integration.

Fig. 1. Approaches to Tightly Integrated Optical Interconnects

Fig. 2. Hybrid 2.5D/3D Packaged System

This paper describes a novel system which uses hybrid 2.5D/3D integration to compose an Intel Stratix 10 FPGA chiplet fabricated on an advanced technology node, three electrical interface chiplets (EIC) fabricated on Intel 16 nm, and three photonic interface chiplets (PIC) fabricated through AIM Photonics (see Fig. 2). An EIC and PIC are stacked using 3D integration based on 55 μm-pitch μbumps, and each EIC/PIC stack is integrated with the FPGA chiplet using 2.5D integration based on 45 μm-pitch μbumps and Intel's Embedded Multi-Die Interconnect Bridge (EMIB) technology. Unlike Fig. 1(d), the system presented in this paper positions the EIC on top of the PIC. The underlying design philosophy is that scaling to many more channels at modest data rates will be particularly advantageous for achieving both high aggregate bandwidth and high energy efficiency [23, 24]. Given this motivation, each EIC/PIC stack implements 1,024 optical channels in each direction with 64 channels wavelength-division multiplexed (WDM) onto a single fiber. The channels are designed to potentially operate at up to 16 Gb/s, leading to a peak bidirectional optical bandwidth of 32 Tb/s per EIC/PIC stack and 96 Tb/s aggregated across the system. We have conducted a rigorous simulation-based study using register-transfer-, gate-, transistor-, and device-level models, and we have conducted a functional demonstration of the system in the lab, including validating a complete opto-electrical channel from EIC through PIC.
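The headline bandwidth figures follow directly from the channel counts and the per-channel rate. The short sketch below reproduces that arithmetic; the variable names are illustrative and not taken from the design. Note that the exact products are 32.768 Tb/s per stack and 98.304 Tb/s for the system, so the quoted 32 Tb/s and 96 Tb/s appear to correspond to rounding 1,024 Gb/s per fiber to 1 Tb/s.

```python
# Figures quoted in the text; the variable names are illustrative only.
CHANNELS_PER_DIRECTION = 1024  # optical channels per EIC/PIC stack, per direction
WDM_CHANNELS_PER_FIBER = 64    # wavelengths multiplexed onto one fiber
CHANNEL_RATE_GBPS = 16         # target per-channel data rate
NUM_STACKS = 3                 # EIC/PIC stacks in the package

fibers_per_direction = CHANNELS_PER_DIRECTION // WDM_CHANNELS_PER_FIBER
per_fiber_gbps = WDM_CHANNELS_PER_FIBER * CHANNEL_RATE_GBPS
per_stack_bidir_gbps = 2 * CHANNELS_PER_DIRECTION * CHANNEL_RATE_GBPS
system_bidir_gbps = NUM_STACKS * per_stack_bidir_gbps

print(f"fibers per direction      : {fibers_per_direction}")
print(f"bandwidth per fiber       : {per_fiber_gbps} Gb/s")
print(f"bidirectional, per stack  : {per_stack_bidir_gbps / 1000:.3f} Tb/s")
print(f"bidirectional, full system: {system_bidir_gbps / 1000:.3f} Tb/s")
```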

Our main contributions are: (1) a detailed description of a novel FPGA/EIC/PIC system which uses hybrid 2.5D/3D integration including discussion of key techniques for overcoming scaling challenges in both the EIC and PIC; (2) simulation-based evaluation demonstrating the potential for this system to achieve 96 Tb/s of bi-directional optical bandwidth; and (3) experimental demonstration of the key components of this system including a complete opto-electrical channel.

## II. SYSTEM DESIGN

Fig. 2 illustrates the three types of chiplets in the system. We use an Intel Stratix 10 FPGA chiplet with 1.3M logic elements, 2.5K digital signal processing (DSP) blocks, and 114 Mb on-chip memory. The FPGA chiplet connects to three separate EICs through Intel's EMIB and an Advanced Interface Bus (AIB) interface. Each EIC is 8×8 mm with 1.4K C4 bumps and 13K μbumps. Each EIC is flip-chip bonded to a PIC. Each PIC is 8.6×8.1 mm with 10K μbumps and is directly attached to an optical fiber array.

### A. EIC Architecture

Fig. 3 shows the EIC die photo, and Fig. 4 illustrates the three major EIC blocks: AIB, crossbar, and transceivers (TRX).

1) AIB: The AIB is a commercially available physical-layer IP that implements the AIB protocol [12]. The AIB connects 24 channels to a crossbar which operates at 500 MHz. Each AIB channel can potentially support up to 40 Gb/s in AIB 1.0 mode or 160 Gb/s in AIB 2.0 mode, although only AIB 1.0 mode is enabled in this system for a total potential AIB bandwidth of 960 Gb/s per EIC.

2) Crossbar: The crossbar routes the AIB channels to the TRX channels and hosts a majority of the control and test infrastructure for the EIC. The crossbar integrates 48 AIB interface macros, 1024 TRX interface macros, and 48 32-bit channels at 500 MHz which connect the AIB interface macros to 48 of the TRX interface macros. Each interface macro is equipped with programmable scan chains, pseudorandom binary sequence (PRBS) generators and verifiers, and fixed pattern generators and verifiers. Both AIB and TRX interface macros can be configured to either pass through or loop back the received data.
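As an illustration of the PRBS generators and verifiers in each interface macro, the sketch below models a PRBS7 generator with a self-synchronizing checker. The text does not state which polynomial the EIC uses, so the choice of PRBS7 (x^7 + x^6 + 1) and the function names `prbs7` and `count_mismatches` are assumptions made only for this example.

```python
def prbs7(seed, n):
    """Return n bits of PRBS7 (x^7 + x^6 + 1) from a nonzero 7-bit seed.

    The polynomial is an assumption; the text does not specify it.
    """
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be nonzero")
    bits = []
    for _ in range(n):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return bits


def count_mismatches(rx):
    """Self-synchronizing check: each bit must equal rx[n-7] XOR rx[n-6]."""
    return sum(rx[n] != (rx[n - 7] ^ rx[n - 6]) for n in range(7, len(rx)))


tx = prbs7(seed=0x5A, n=254)   # two full periods of 127 bits
rx = list(tx)
rx[100] ^= 1                   # inject a single bit error

print("mismatches, clean stream    :", count_mismatches(tx))
print("mismatches, one bit flipped :", count_mismatches(rx))
```

Because the checker predicts each bit from the received stream itself, it needs no seed alignment with the generator; the trade-off is that one flipped bit is counted at each position where it enters the recurrence.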

3) TRX: The TRX is designed for high-density 3D integration with the PIC and is similar to the architecture from Khilwani et al. [13]. The TRX comprises four TRX groups, each containing 256 TRX cells. Each TRX cell includes a transmitter (TX) path that transmits data from the crossbar to the PIC, and a receiver (RX) path that receives data from the PIC and forwards it to the crossbar. Both paths are designed to provide 16 Gb/s bandwidth per channel, resulting in an aggregate TRX bandwidth of 16 Tb/s in each direction.

Fig. 3. EIC and PIC Die Photos

Fig. 4. EIC Architecture Block Diagram

The TX path consists of a 128-bit buffer, serializer, and level-shifting driver. Similarly, the RX path includes an offset DAC, analog front end (AFE), deserializer, and 128-bit buffer. The 128-bit buffers form a mesochronous interface for synchronization between the TRX and crossbar. The TRX/crossbar interface supports 32 bits per channel at 500 MHz and two test modes. The TX test mode provides a scan chain interface to the 128-bit buffer, and the RX test mode enables alternating 32-bit patterns in place of streaming data from the PIC.
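A cycle-level toy model can make the role of the 128-bit buffer concrete. The sketch below assumes the buffer is organized as four 32-bit word slots (128 bits at 32 bits per cycle), with write and read pointers that free-run at the same frequency and no handshake. That organization, the class name `MesochronousBuffer`, and the `read_lag` parameter are assumptions for illustration; sub-cycle phase offsets are not modeled explicitly and are represented only by the slot margin between the two pointers.

```python
class MesochronousBuffer:
    """Cycle-level toy model of a free-running ring buffer with no handshake."""

    def __init__(self, slots=4):
        self.slots = [None] * slots
        self.wr = 0  # advanced once per cycle by the write-side clock
        self.rd = 0  # advanced once per cycle by the read-side clock

    def write(self, word):
        self.slots[self.wr] = word
        self.wr = (self.wr + 1) % len(self.slots)

    def read(self):
        word = self.slots[self.rd]
        self.rd = (self.rd + 1) % len(self.slots)
        return word


def stream(words, read_lag=2, slots=4):
    """Write one word per cycle; start reading read_lag cycles after the first write."""
    buf = MesochronousBuffer(slots)
    received = []
    for cycle, word in enumerate(words):
        buf.write(word)
        if cycle >= read_lag:
            received.append(buf.read())
    return received


words = [0xA5A50000 + i for i in range(10)]
received = stream(words)                  # reader trails the writer by two slots
overrun = stream(words, read_lag=4)       # lag equal to the depth: slots get overwritten

print("in-order delivery :", received == words[:-2])
print("overrun corrupts  :", overrun != words[:-4])
```

In this model the data arrives intact as long as the read pointer trails the write pointer by fewer slots than the buffer depth, which is why no back-pressure signal is required when both sides run at the same rate.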

Unlike prior work [13], our architecture: (1) uses a heater DAC with linear power output to enable simplified tuning and (2) integrates active TX/RX re-calibration circuitry for long-term operation. By measuring the amplitude of the AFE signal, we create a control feedback loop to the heater DACs which keeps the modulators and filters at the appropriate resonance. In the case of the RX, the calibration also adjusts the offset DAC which keeps the received signal centered at mid-rail.
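The re-calibration loop described above can be sketched as a simple hill-climbing controller. This is a minimal illustration and not the circuit implemented in the EIC: the Lorentzian amplitude model, the 8-bit code range, the numeric values, and the names `afe_amplitude` and `recalibrate` are all hypothetical. The sketch also assumes the loop locks to the amplitude maximum, as would suit an RX filter. A heater DAC with linear power output is convenient here because the thermal resonance shift is roughly proportional to heater power, so each code step moves the resonance by about the same amount.

```python
def afe_amplitude(code, target_code=137.0, width=12.0):
    """Hypothetical AFE amplitude versus heater DAC code (Lorentzian peak)."""
    detuning = (code - target_code) / width
    return 1.0 / (1.0 + detuning * detuning)


def recalibrate(code, measure, lo=0, hi=255, max_steps=512):
    """Step the heater DAC code until neither neighbor gives a larger amplitude."""
    for _ in range(max_steps):
        here = measure(code)
        up = measure(min(code + 1, hi))
        down = measure(max(code - 1, lo))
        if up > here and up >= down:
            code = min(code + 1, hi)
        elif down > here:
            code = max(code - 1, lo)
        else:
            break
    return code


# Initial lock, then re-lock after a hypothetical drift of the resonance.
locked = recalibrate(100, afe_amplitude)
relocked = recalibrate(locked, lambda c: afe_amplitude(c, target_code=141.0))

print("initial lock code :", locked)
print("code after drift  :", relocked)
```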

### B. EIC Scaling Challenges

Although prior EIC implementations have demonstrated up to 100 optical channels [3, 4, 6–10, 13], scaling to 1024 optical channels raises new physical design challenges with respect to routing and clocking.

- 1) Routing distance: Routing to the crossbar from 16 TRX cells in each TRX row was particularly challenging. The TRX is  $\approx$ 4 mm wide, resulting in the latency from the rightmost TRX cells exceeding the 2 ns clock period of the crossbar. Although only some cells violated timing, we inserted pipeline registers on all signals eight cells away from the signal's originating cell.
- 2) Routing congestion: Each TRX cell interface consists of 64 data bits and 7 control bits, resulting in 1,136 wires at the left edge of each 16-cell TRX row. This caused near-100% horizontal routing track usage and limited scalability of the row size. We used a "swizzle" layout with one signal buffered at a time and then shifted down one position in the bus. This style allowed for regular buffering of a small number of signals at a time due to limited space for downward vias, and it also allowed for a modular layout pattern. The crossbar then effectively needed to mux over tens of thousands of signals from the TRX to thousands of pins on the AIB. Given the large number of signals, we used hierarchical design with AIB interface and TRX interface macros. Routing by abutment was used to simplify top-level routing and timing closure.
- 3) Clock distribution: Although the crossbar and TRX could operate on the same clock, the TRX layout was too dense to allow for clock tree balancing across the array. We instead used mesochronous buffers at the TRX-crossbar interface [14]. No handshake mechanism was needed, because both sides operate at the same frequency, and there is no back-pressure in the physical-level interface.
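The wire counts behind the congestion and crossbar-fanout arguments above can be checked with a few lines of arithmetic. The constants come from the text; the helper `pipeline_registers` and its convention of counting cells from the crossbar-side edge of a row are assumptions about how the eight-cell pipelining rule is applied.

```python
CELLS_PER_ROW = 16
DATA_BITS_PER_CELL = 64
CTRL_BITS_PER_CELL = 7
TOTAL_CELLS = 1024
PIPELINE_INTERVAL_CELLS = 8  # a register is inserted eight cells from the origin

wires_per_cell = DATA_BITS_PER_CELL + CTRL_BITS_PER_CELL
wires_at_row_edge = CELLS_PER_ROW * wires_per_cell
total_trx_signals = TOTAL_CELLS * wires_per_cell


def pipeline_registers(cells_from_edge):
    """Registers crossed by a signal that starts cells_from_edge cells from the row edge.

    Assumed convention: index 0 is the cell nearest the crossbar.
    """
    return cells_from_edge // PIPELINE_INTERVAL_CELLS


print("wires per TRX cell       :", wires_per_cell)
print("wires at row edge        :", wires_at_row_edge)
print("TRX signals at crossbar  :", total_trx_signals)
print("registers for cells 0..15:", [pipeline_registers(i) for i in range(CELLS_PER_ROW)])
```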

### C. PIC Architecture

The PIC is co-designed with the EIC for flip-chip bonding. Each TRX cell modulates (TX) and detects (RX) a single wavelength; 64 wavelengths are wavelength-division multiplexed (WDM) onto a single optical link, resulting in a total of 16 TX links and 16 RX links arranged into four groups on the PIC (see Fig. 3). Fig. 5 shows the link architecture based on our own prior work [26, 27]. At the TX side, we use ring-assisted Mach-Zehnder interferometer (RAMZI)-based interleavers to subdivide the incoming wavelength channels onto separate buses [25]. Data is modulated onto each wavelength by separate banks of cascaded microdisk modulators, driven by the EIC through the μbumps. The 64 modulated channels are then recombined into a single fiber output. At the RX side, a similar interleaving structure sends the wavelengths onto four buses of cascaded microdisk filters for sensing. The optical devices are designed to support each wavelength operating at 16 Gb/s, achieving an aggregate bandwidth of 1 Tb/s per fiber, and thus 16 Tb/s per PIC with a 2 Tb/s/mm shoreline bandwidth density.

Fig. 5. Photonic Link Architecture (measured link budgets satisfying simulated RX sensitivity)

### D. PIC Scaling Challenges

Just as for the EIC, scaling to 1024 optical channels raises new PIC design challenges, including managing optical bandwidth, optical losses, process variations, and thermal control.

- 1) Optical bandwidth: Conventional single-bus link architectures struggle to accommodate massive WDM due to the limited free spectral range (FSR) of the microresonators. We adopt a multi-bus link architecture that de-interleaves WDM channels onto multiple buses, as proposed previously [11, 18]. Since each stage of de-interleaving doubles the channel spacing, unwanted resonances are placed between channels with minimal crosstalk. This enables 64 WDM channels spaced at 100 GHz (spanning > 50 nm in C-band) with modulators and filters of a moderate 25.69 nm FSR.
- 2) Optical losses: Massive WDM scaling also reduces the optical power budget per channel. Silicon nonlinearities limit the total optical power per waveguide, and optical losses must be minimized to meet the target receiver sensitivity. We carefully optimize the number of interleaver stages to balance interleaver insertion loss vs. accumulated modulator/filter passing losses. We also adopt custom vertical-junction (VJ) microdisk modulators similar to [16]. The improved depletion response of VJ modulators compared to lateral-junction modulators allows using < 0.8 V CMOS-compatible voltage swings, while still achieving high extinction ratios and thus low power penalties [15]. Chip-fiber coupling loss can also be reduced to < 1 dB per facet by using optimized edge coupler designs, such as [8].
- 3) Process variations and thermal management: Thermal control is challenging with so many optical channels. On-chip thermal control is desirable, but faces limited area due to dense packaging requirements. Off-chip thermal control has more area available, but suffers from limited I/O count and bandwidth to the many devices requiring tuning. We adopt a fabrication-robust platform where wide waveguides are used in certain sections to reduce the sensitivity to process variations [19], while maintaining single-mode operation with specially designed bend geometries. We also explore the use of substrate undercut around thermally tuned devices, as shown in Fig. 3, which can improve thermal tuning efficiency by at least $5\times$ [20].

Fig. 6. Experimental Setup
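The optical-bandwidth figures in this subsection can be sanity-checked with the standard conversion between frequency spacing and wavelength spacing, Δλ ≈ λ²Δf/c. The sketch below assumes a 1550 nm center wavelength, which is a typical C-band value and is not stated in the text. It also assumes two de-interleaving stages, inferred from the four buses in the link architecture. The function names are illustrative.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum

CENTER_NM = 1550.0         # assumed C-band center wavelength
NUM_CHANNELS = 64
SPACING_GHZ = 100.0


def spacing_nm(center_nm, spacing_ghz):
    """Wavelength spacing for a frequency spacing, using dlambda = lambda^2 * df / c."""
    lam_m = center_nm * 1e-9
    return lam_m ** 2 * (spacing_ghz * 1e9) / C_M_PER_S * 1e9


def per_bus(stages):
    """Buses, channels per bus, and per-bus spacing after de-interleaving.

    Each stage doubles the number of buses and the channel spacing on each bus.
    """
    buses = 2 ** stages
    return buses, NUM_CHANNELS // buses, SPACING_GHZ * buses


channel_nm = spacing_nm(CENTER_NM, SPACING_GHZ)
span_nm = (NUM_CHANNELS - 1) * channel_nm  # first to last channel center

print(f"spacing per channel : {channel_nm:.3f} nm")
print(f"span of 64 channels : {span_nm:.1f} nm")
print("two stages (buses, channels per bus, GHz):", per_bus(2))
```

Under these assumptions the 64-channel grid spans roughly twice the 25.69 nm FSR quoted for the modulators and filters, which is the situation the multi-bus architecture is meant to address.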

## III. EVALUATION

We use simulation-based and experimental results to demonstrate the potential for hybrid 2.5D/3D integration to enable co-packaged optical interconnects.

### A. Simulation-Based Evaluation

We use bare-die measurements of photonic devices to characterize the optical link losses, parasitic resistance/capacitance of the optical modulator and photodetector, drive voltage of the photodetector, and thermal characteristics of the heaters. We build compact models of the PIC devices and use transistor-level modeling of the TRX on the EIC to characterize the EIC to PIC bandwidth, latency, and energy. We build register-transfer-level (RTL) behavioral models of the TRX and use RTL modeling of the AIB and crossbar along with a target FPGA design to characterize the FPGA to EIC bandwidth, latency, and energy.

Our end-to-end simulations, from the FPGA to the TRX microbumps, validate that the FPGA can sustain 768 Gb/s over the AIB, crossbar, TRX, and PIC by using 48 of the optical channels each operating at 16 Gb/s. This is below the peak AIB 1.0 bandwidth, since two 32-bit EIC channels are mapped to each 80-bit AIB channel. Our simulations also validate that the EIC can sustain 32 Tb/s bi-directional bandwidth over the crossbar, TRX, and PIC by generating PRBS traffic on-chip for every TRX channel. Finally, our device-level characterization validates that we can meet the optical power limits and receiver sensitivity target (-22 dBm for 1E-12 bit error rate at 16 Gb/s). Our simulations show an end-to-end latency of 39 ns from the FPGA to PIC, most of which stems from pipeline and synchronization delays in the crossbar, which could potentially be optimized to reduce latency. Our simulation-based energy analysis suggests it should be possible to achieve sub-1 pJ/b including the TRX and PIC with another 1 pJ/b for the AIB and crossbar.
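The gap between the sustained 768 Gb/s and the 960 Gb/s AIB peak follows from how the 32-bit EIC channels fill each 80-bit AIB channel. The sketch below works through that arithmetic using the figures quoted in this paper; the variable names are illustrative.

```python
AIB_CHANNELS = 24
AIB_BITS_PER_CHANNEL = 80       # 80 bits at 500 MHz gives 40 Gb/s in AIB 1.0 mode
EIC_BITS_PER_CHANNEL = 32
EIC_CHANNELS_PER_AIB = 2
CROSSBAR_CLOCK_MHZ = 500

aib_peak_gbps = AIB_CHANNELS * AIB_BITS_PER_CHANNEL * CROSSBAR_CLOCK_MHZ / 1000
optical_channels_used = AIB_CHANNELS * EIC_CHANNELS_PER_AIB
per_optical_channel_gbps = EIC_BITS_PER_CHANNEL * CROSSBAR_CLOCK_MHZ / 1000
sustained_gbps = optical_channels_used * per_optical_channel_gbps
aib_utilization = sustained_gbps / aib_peak_gbps

print(f"AIB peak bandwidth      : {aib_peak_gbps:.0f} Gb/s")
print(f"optical channels driven : {optical_channels_used}")
print(f"sustained FPGA bandwidth: {sustained_gbps:.0f} Gb/s")
print(f"AIB utilization         : {aib_utilization:.0%}")
```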

### B. Experimental Evaluation

Fig. 6 illustrates our experimental setup. The optical path has a tunable laser source followed by a thulium-doped fiber amplifier (TDFA) and a polarization controller (PC) before entering the PIC. The modulated optical carrier from the PIC is converted to an electrical signal by a Thorlabs photodetector and inspected by a Keysight oscilloscope. The EIC is configured through an SPI controller to generate a 128-bit repeating fixed pattern or pseudo-random pattern. The 128-bit pattern is sent from the crossbar through the TRX to the PIC for modulation. We successfully demonstrated the FPGA functioning in isolation and demonstrated multiple complete opto-electrical channels from the EIC through the PIC, each running at 1 Gb/s (see Fig. 7). However, several implementation issues prevented demonstrating the system's full capabilities. While each optical channel is functional, we were unable to include delay-locked loops in the EIC, meaning the RX clock-to-data skew is unknown. A timing analysis bug prevented the crossbar from running at the target 500 MHz, and a power routing bug led to unreliable operation of several AIB interface units. A future revision would be able to correct these issues.

Fig. 7. Eye Diagrams for Opto-Electrical EIC/PIC Channel

## IV. CONCLUSION

In this paper, we presented a novel system which uses hybrid 2.5D/3D integration to compose a state-of-the-art FPGA compute chiplet, three electrical interface chiplets, and three photonic interface chiplets. Our simulation-based evaluation demonstrates the potential for this system to achieve 96 Tb/s of bi-directional optical bandwidth, and our experimental demonstration functionally validates key components including a complete opto-electrical channel. While implementation oversights prevented our experimental system from demonstrating the target system-level bandwidth, this work still shows the potential of hybrid 2.5D/3D integration and serves as an important next step towards scaling co-packaged optical interconnects for inter-node communication.

## V. ACKNOWLEDGEMENTS

This work was supported in part by DARPA awards HR00111830002 (CHIPS) and HR00111920014 (PIPES), and ARPA-E award DE-AR000843 (ENLITENED). We thank Intel for their contributions and support of this work.

## REFERENCES

- [1] C. Batten, A. Joshi, J. S. Orcutt, A. Khilo, B. Moss, C. W. Holzwarth, M. A. Popović, H. Li, H. I. Smith, J. L. Hoyt, F. X. Kärtner, R. J. Ram, V. Stojanović, and K. Asanović. Building Manycore Processor-to-DRAM Networks with Monolithic CMOS Silicon Photonics. *IEEE Micro*, 29(4):8–21, Jul/Aug 2009.
- [2] J. F. Buckwalter, X. Zheng, G. Li, K. Raj, and A. V. Krishnamoorthy. A Monolithic 25-Gb/s Transceiver With Photonic Ring Modulators and Ge Detectors in a 130-nm CMOS SOI Process. *IEEE Journal of Solid-State Circuits (JSSC)*, 47(6):1309–1322, Apr 2012.
- [3] P.-H. Chang, A. Samanta, P. Yan, M. Fu, Y. Zhang, M. B. On, A. Kumar, H. Kang, I.-M. Yi, D. Annabattuni, D. Scott, R. Patti, Y.-H. Fan, Y. Zhu, S. Palermo, and S. J. B. Yoo. A 3D Integrated Energy-Efficient Transceiver Realized by Direct Bond Interconnect of Co-Designed 12nm FinFET and Silicon Photonic Integrated Circuits. *Journal of Lightwave Technology (JLT)*, 41(21), Nov 2023.
- [4] Y. Chen, M. Kibune, A. Toda, A. Hayakawa, T. Akiyama, S. Sekiguchi, H. Ebe, N. Imaizumi, T. Akahoshi, S. Akiyama, S. Tanaka, T. Simoyama, K. Morito, T. Yamamoto, T. Mori, Y. Koyanagi, and H. Tamura. A 25Gb/s Hybrid Integrated Silicon Photonic Transceiver in 28nm CMOS and SOI. *Int'l Solid-State Circuits Conf. (ISSCC)*, Feb 2015.
- [5] Q. Cheng, M. Bahadori, M. Glick, S. Rumley, and K. Bergman. Recent Advances in Optical Technologies for Data Centers: A Review. *Optica*, 5(11):1354–1370, Oct 2018.
- [6] S. Daudlin, S. Lee, D. Khilwani, C. Ou, A. Rizzo, S. Wang, M. Cullen, A. Molnar, and K. Bergman. Ultra-dense 3D Integrated 5.3 Tb/s/mm2 80 Micro-Disk Modulator Transmitter. *Optical Fiber Communication Conf.*, Mar 2023.
- [7] S. Daudlin, A. Rizzo, N. C. Abrams, S. Lee, D. Khilwani, V. Murthy, J. Robinson, T. Collier, A. Molnar, and K. Bergman. 3D-Integrated Multichip Module Transceiver for Terabit-Scale DWDM Interconnects. *Optical Fiber Communication Conf.*, Jun 2021.
- [8] S. Daudlin, A. Rizzo, S. Lee, D. Khilwani, C. Ou, S. Wang, A. Novick, V. Gopal, M. Cullen, R. Parsons, A. Molnar, and K. Bergman. 3D Photonics for Ultra-Low Energy, High Bandwidth-Density Chip Data Links. *Computing Research Repository (CoRR)*, arXiv:2310.01615, Oct 2023.
- [9] K. Hosseini, E. Kok, S. Y. Shumarayev, C.-P. Chiu, A. Sarkar, A. Toda, Y. Ke, A. Chan, D. Jeong, M. Zhang, S. Raman, T. Tran, K. A. Singh, P. Bhargava, C. Zhang, H. Lu, R. Mahajan, X. Li, N. Deshpande, C. O'Keeffe, T. T. Hoang, U. Krishnamoorthy, C. Sun, R. Meade, V. Stojanović, and M. Wade. 8 Tbps Co-Packaged FPGA and Silicon Photonics Optical IO. *Optical Fiber Communication Conf.*, Feb 2021.
- [10] K. Hosseini, E. Kok, S. Y. Shumarayev, D. Jeong, A. Chan, A. Katzin, S. Liu, R. Roucka, M. Raval, M. Mac, C.-P. Chiu, T. Tran, K. A. Singh, S. Raman, Y. Ke, C. Li, L.-F. Yang, P. Chao, H. Lu, F. Luna, X. Li, T. T. Hoang, A. Sarkar, A. Toda, R. Mahajan, N. Deshpande, C. O'Keeffe, U. Krishnamoorthy, V. Stojanović, C. Madden, C. Zhang, M. Sysak, P. Bhargava, C. Sun, and M. Wade. 5.12 Tbps Co-Packaged FPGA and Silicon Photonics Interconnect I/O. *Symp. on VLSI Technology and Circuits (VLSI)*, Jun 2022.
- [11] A. James, A. Novick, A. Rizzo, R. Parsons, K. Jang, M. Hattink, and K. Bergman. Scaling Comb-Driven Resonator-Based DWDM Silicon Photonic Links to Multi-Tb/s in the Multi-FSR Regime. *Optica*, Jul 2023.
- [12] D. Kehlet. Accelerating Innovation Through A Standard Chiplet Interface: The Advanced Interface Bus (AIB). Intel Whitepaper. https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/accelerating-innovation-through-aib-whitepaper.pdf.
- [13] D. Khilwani, S. Lee, C. Ou, S. Daudlin, A. Rizzo, S. Wang, M. Cullen, K. Bergman, and A. Molnar. 3D-Integrated, Low Power, High Bandwidth Density Opto-Electronic Transceiver. *Int'l Symp. on Circuits and Systems (ISCAS)*, May 2024.
- [14] D. Konstantinou, A. Psarras, C. Nicopoulos, and G. Dimitrakopoulos. The Mesochronous Dual-Clock FIFO Buffer. *Symp. on VLSI Technology and Circuits (VLSI)*, Jun 2020.
- [15] A. Novick, A. James, L. Y. Dai, Z. Wu, A. Rizzo, S. Wang, Y. Wang, M. Hattink, V. Gopal, K. Jang, R. Parsons, and K. Bergman. High-Bandwidth Density Silicon Photonic Resonators for Energy-Efficient Optical Interconnects. *Applied Physics Letters*, 10(4):041306, Dec 2023.
- [16] A. Novick, S. Wang, A. Rizzo, V. Gopal, and K. Bergman. Ultra-Efficient Interleaved Vertical-Junction Microdisk Modulator with Integrated Heater. *Optical Fiber Communication Conf.*, Mar 2024.
- [17] J. S. Orcutt, B. Moss, C. Sun, J. Leu, M. Georgas, J. Shainline, E. Zgraggen, H. Li, J. Sun, M. Weaver, S. Urošević, M. Popović, R. J. Ram, and V. Stojanović. Open Foundry Platform for High-Performance Electronic-Photonic Integration. *Optics Express*, 20(11):12222–12232, May 2012.
- [18] A. Rizzo, S. Daudlin, A. Novick, A. James, V. Gopal, V. Murthy, Q. Cheng, B. Y. Kim, X. Ji, Y. Okawachi, M. van Niekerk, V. Deenadayalan, G. Leake, M. Fanto, S. Preble, M. Lipson, A. Gaeta, and K. Bergman. Petabit-Scale Silicon Photonic Interconnects With Integrated Kerr Frequency Combs. *Journal of Selected Topics in Quantum Electronics*, 29(1):3700120, Jan/Feb 2023.
- [19] A. Rizzo, U. Dave, A. Novick, A. Freitas, S. P. Roberts, A. James, M. Lipson, and K. Bergman. Fabrication-Robust Silicon Photonic Devices in Standard Sub-Micron Silicon-on-Insulator Processes. *Optics Letters*, 48(2):215, Jan 2023.
- [20] A. Rizzo, V. Deenadayalan, M. van Niekerk, G. Leake, C. Tison, A. Novick, D. Coleman, K. Bergman, S. Preble, and M. Fanto. Ultra-Efficient Foundry-Fabricated Resonant Modulators with Thermal Undercut. *Conf. on Lasers and Electro-Optics (CLEO)*, May 2023.
- [21] J. C. Rosenberg, W. M. J. Green, J. Proesel, S. Assefa, D. M. Gill, T. Barwicz, S. M. Shank, C. Reinholm, M. Khater, E. Kiewra, S. Kamlapurkar, and Y. A. Vlasov. A Monolithic Microring Transmitter in 90nm SOI CMOS Technology. *Photonics Conference*, Sep 2013.
- [22] C. Sun, M. T. Wade, Y. Lee, J. S. Orcutt, L. Alloatti, M. S. Georgas, A. S. Waterman, J. M. Shainline, R. R. Avizienis, S. Lin, B. R. Moss, R. Kumar, F. Pavanello, A. H. Atabaki, H. M. Cook, A. J. Ou, J. C. Leu, Y.-H. Chen, K. Asanović, R. J. Ram, M. A. Popović, and V. M. Stojanović. Single-Chip Microprocessor that Communicates Directly Using Light. *Nature*, 528(7583):534–538, Dec 2015.
- [23] D. Tonietto. Connecting Switch to Fiber: The Energy Efficiency Challenge. *Optical Fiber Communication Conf.*, Mar 2024.
- [24] W. J. Turner, J. W. Poulton, Y. Nishi, X. Chen, B. Zimmer, S. Song, W. John M, W. J. Dally, and C. T. Gray. Leveraging Micro-Bump Pitch Scaling to Accelerate Interposer Link Bandwidths for Future High-Performance Compute Applications. *Custom Integrated Circuits Conf. (CICC)*, Apr 2024.
- [25] S. Wang, Y. Wang, X. Meng, K. Hosseini, T. T. Hoang, and K. Bergman. Automated Tuning of Ring-Assisted MZI–Based Interleaver for DWDM Systems. *Optical Fiber Communication Conf.*, May 2024.
- [26] Y. Wang, A. Novick, R. Parsons, S. Wang, K. Jang, A. James, M. Hattink, V. Gopal, A. Rizzo, C.-P. Chiu, K. Hosseini, T. T. Hoang, and K. Bergman. Scalable Architecture for Sub-pJ/b Multi-Tbps Comb-Driven DWDM Silicon Photonic Transceiver. *Next-Generation Optical Communication: Components, Sub-Systems, and Systems (SPIE OPTO)*, Mar 2023.
- [27] Y. Wang, S. Wang, R. Parsons, A. Novick, V. Gopal, K. Jang, A. Rizzo, C.-P. Chiu, K. Hosseini, T. T. Hoang, S. Shumarayev, and K. Bergman. Silicon Photonics Chip I/O for Ultra High-Bandwidth and Energy-Efficient Die-to-Die Connectivity. *Custom Integrated Circuits Conf. (CICC)*, Apr 2024.